Abstract

We study a social network consisting of agents organized as a hierarchical M-ary rooted tree, common 
in enterprise and military organizational structures. The goal is to aggregate information to solve a binary
hypothesis testing problem. Each agent at a leaf of the tree, and only such an agent, makes a direct 
measurement of the underlying true hypothesis. The leaf agent then generates a message and sends it 
to its supervising agent, at the next level of the tree. Each supervising agent aggregates the messages 
from the M members of its group, produces a summary message, and sends it to its supervisor at the 
next level, and so on. Ultimately, the agent at the root of the tree makes an overall decision. We derive 
upper and lower bounds for the Type I and Type II error probabilities associated with this decision with 
respect to the number of leaf agents, which in turn characterize the convergence rates of the Type I, Type II,
and total error probabilities. We also provide a message-passing scheme involving non-binary message 
alphabets and characterize the exponent of the error probability with respect to the message alphabet 
size.

I. Introduction

We consider a binary hypothesis testing problem and an associated social network that attempts (jointly) 
to solve the problem. The network consists of a set of agents with interconnections among them. Each 
of the agents makes a measurement of the underlying true hypothesis, observes the past actions of his 
neighboring agents, and makes a decision to optimize an objective function (e.g., probability of error). 
In this paper, we are interested in the following questions: Will the agents asymptotically learn the 
underlying true hypothesis? More specifically, will the overall network decision converge in probability
to the correct decision as the network size (number of agents) increases? If so, how fast is the convergence 
with respect to the network size? In general, the answers to these questions depend on the social network 
structure. There are two structures primarily studied in the previous literature. 

• Feedforward structure: Each agent makes a decision sequentially based on its private measurement 
and the decisions of some or all previous agents. For example, we usually decide on which restaurant 
to dine in or which movie to go to based on our own taste and how popular they appear to be with 
previous patrons. Investors often behave similarly in asset markets. 

• Hierarchical tree structure: Each agent makes a decision based on its private measurement and the 
decisions of its descendent agents in the tree. This structure is common in enterprises, military 
hierarchies, political structures, online social networks, and even engineering systems (e.g., sensor 
networks). 

The problem of social learning as described above is closely related to the decentralized detection 
problem. The latter concerns decision making in a sensor network, where each of the sensors is allowed to 
transmit a summarized message of its measurement (using a compression function) to an overall decision 
maker (usually called the fusion center). The goal typically is to characterize the optimal compression 
functions such that the error probability associated with the detection decision at the fusion center is 
minimized. However, this problem becomes intractable as the network structure gets complicated. Much
of the recent work studies the decentralized detection problems in the asymptotic regime, focusing on 
the problems of the convergence and convergence rate of the error probability. 

A. Related Work 

The literature on social learning is vast, spanning various disciplines including signal processing, game
theory, information theory, economics, biology, physics, computer science, and statistics. Here we only 
review the relevant asymptotic learning results in the two aforementioned network structures.

1) Feedforward Structure: Suppose that a set of agents make decisions sequentially about the
underlying truth $\theta$, which equals one of two hypotheses. The first agent makes a measurement of
$\theta$ and generates a binary decision $d_1$, which is observed by all the other agents. The second
agent makes its decision $d_2$ based on its own measurement and $d_1$. Recursively, the decision $d_N$
of the $N$th agent is based on its own measurement and the decisions observed from agents 1 to $N-1$.
Banerjee [3] and Bikhchandani et al. [4] show that in the case where the agent signals only allow
bounded private beliefs; i.e., the likelihood ratio of each signal is bounded, if the first two agents
make the same decision, then the rest of the agents would simply copy this decision ignoring their own
measurements, even if their own measurements indicate the opposite hypothesis. This interesting
phenomenon is also known as herding. Moreover, we have $\lim_{N\to\infty} \mathbb{P}(d_N = \theta) < 1$,
which means that the agent decisions do not converge in probability to the underlying true hypothesis
as the number of agents goes to infinity; i.e., the agents cannot learn asymptotically. Smith and
Sorensen [5] show that if the agent signals allow unbounded private beliefs; i.e., the likelihood
ratio of each signal can be greater than any constant, then these agents learn asymptotically. In
other words, the agent decisions converge in probability to the underlying true hypothesis:
$\lim_{N\to\infty} \mathbb{P}(d_N = \theta) = 1$. Krishnamurthy [6], [7] studies this problem from the
perspective of quickest time change detection. A similar scenario where agents make decisions
sequentially but each agent only observes the decision from its immediate previous agent (also known
as tandem network) is considered in [8]-[12]. Veeravalli [11] shows that the error probability
converges sub-exponentially with respect to the number $N$ of agents in the case where the private
measurements are independent and identically Gaussian distributed. Tay et al. [12] show that the error
probability in general converges sub-exponentially and derive a lower bound for the error probability.
Djuric and Wang [13] investigate the evolution of social belief in these structures. Lobel et al. [14]
derive an upper bound for the error probability in the feedforward structure where each agent observes
a decision randomly from all the previous agents.

2) Hierarchical Tree Structure: In many relevant situations, the social network structure is very 
complicated, wherein each individual makes its decision not by learning from all the past agent decisions, 
but from only a subset of agents that are directly connected to this individual. For complex network 
structures, Jadbabaie et al. [15] study the social learning problem from a non-Bayesian perspective.
Acemoglu et al. [16] provide some sufficient conditions for agents to learn asymptotically from a Bayesian
perspective. Cattivelli and Sayed [17] study this problem using a diffusion approach. However, analyzing
the convergence rate of learning for complex structures remains largely open.

Recent studies suggest that social networks often exhibit hierarchical structures [18]-[28]. These
structures naturally arise from the concept of social hierarchy, which has been observed and extensively
studied in fish, birds, and mammals [18]. Hierarchical structures can also be observed in networks of
human societies [19]; for example, in enterprise organizations, military hierarchies, political structures
[22], and even online social networks [26].

In the special case where the tree height is 1, this structure is usually referred to as the star configuration
[29]-[46]. With the assumption of (conditional) independence of the agent measurements, the error
probability in the star configuration converges exponentially with respect to the number N of agents.
Tree networks with bounded height (greater than 1) are considered in [47]-[55]. In a tree network,
measurements are summarized by leaf agents into smaller messages and sent to their parent agents, each 
of which fuses all the messages it receives with its own measurement (if any) and then forwards the new 
message to its parent agent at the next level. This process takes place throughout the tree, culminating 
at the root where an overall decision is made. In this way, information from each agent is aggregated 
at the root via a multihop path. Note that the information is 'degraded' along the path. Therefore, the 
convergence rate for tree networks cannot be better than that of the star configuration. More specifically, 
under the Bayesian criterion, the error probability converges exponentially fast to 0 with an error exponent
that is worse than the one associated with the star configuration [51].

The error probability convergence rate in trees with unbounded height was considered in [56] and [57].
We study in [56] the error probability convergence rate in balanced binary relay trees, where each nonleaf
agent in this tree has two child agents and all the leaf agents are at the same distance from the root. 
Hence, this situation represents the worst-case scenario in the sense that the minimum distance from the 
root to the leaves is the largest. We show that if each agent in the tree aggregates the messages from 
its child agents using the unit-threshold likelihood-ratio test, then we can derive tight upper and lower 
bounds for the total error probability at the root, which characterize the convergence rate of the total error 
probability. Kanoria and Montanari [57] provide an upper bound for the convergence rate of the error
probability in M-ary relay trees (directed trees where each nonleaf node has indegree M and outdegree 
1), with any combination of fusion rules for all nonleaf agents. Their result gives an upper bound on 
the rate at which an agent can learn from others in a social network. To elaborate further, the authors 
of [57] provide the following upper bound for the convergence rate of the error probability $P_N$ with any
combination of fusion rules:

$$\log_2 P_N^{-1} = O\left(N^{\log_M \frac{M+1}{2}}\right). \qquad (1)$$

They also provide the following asymptotic lower bound for the convergence rate in the case of majority
dominance rule with random tie-breaking:

$$\log_2 P_N^{-1} = \Omega\left(N^{\log_M \lfloor \frac{M+1}{2} \rfloor}\right). \qquad (2)$$

In the case where M is odd, the majority dominance rule achieves the upper bound in (1), which shows
that the bound is the optimal convergence rate. However, in the case where M is even, there exists a gap
between these two bounds because of the floor function in the second bound. In this case, [57] leaves
two questions open:

Q1. Does the majority dominance rule achieve the upper bound in (1)?
Q2. Do there exist other strategies that achieve the upper bound in (1)?

In our paper, for the case where M is even, we answer the first question definitively by showing that the
majority dominance rule does not achieve the upper bound in (1). For the second question, we provide
a strategy that is closer to achieving the upper bound in (1) than the majority dominance rule.

Our paper also differs from (and complements) [57] in a number of other ways. For example, our
analysis also includes non-asymptotic results. Moreover, we also consider the Bayesian likelihood-ratio
test¹ (the fusion rule for Bayesian learning) as an alternative fusion rule, not considered in [57]. These
differences should become clear as we clarify the contributions of this paper in the next section.

B. Contributions 

In this paper, we consider the learning problem in social networks configured as M-ary relay trees. 
Each agent at the leaf level, and only such an agent, takes a direct measurement of the underlying truth 
and generates a message, which is sent to its parent agent. Each intermediate agent in the tree receives 
messages from its child agents and aggregates them into a new message, which is again sent to its parent 
agent at the next level. This process takes place at each nonleaf agent culminating at the root, where a 
final decision is made. In this way, the information from the leaf agents is aggregated into a summarized 
form at the decision maker at the root. This hierarchical structure is of interest because it represents the 
worst-case scenario in the sense that the leaf agents are maximally far away from the decision maker at 
the root. 

In the study of social networks, M-ary relay trees arise naturally. First, as pointed out before, many
organizational structures are well described in this way. Also, it is well known that many real-world
social networks, including email networks [58] and the Internet [59], are scale-free networks; i.e., the
probability $P(\ell)$ that $\ell$ links are connected to an agent is $P(\ell) \sim c\,\ell^{-\gamma}$, where c is a
normalization constant and the parameter $\gamma \in (2,3)$. In other words, the number of links does not
depend on the network size and is bounded with high probability. Moreover, Newman et al. [60] show that
the average degree in a social network is bounded or grows very slowly as the network size increases.
Therefore, to study the learning problem in social networks, it is reasonable to assume that each nonleaf
agent in the tree has a finite number of child agents, in which case the tree height grows unboundedly as
the number of agents goes to infinity.

¹By the Bayesian likelihood-ratio test, we mean a likelihood-ratio test in which the threshold is given by
the ratio of the prior probabilities.

In this paper, we study two ways of aggregating information: the majority dominance rule (a typical 
non-Bayesian rule) and the Bayesian likelihood-ratio test. Our contributions are as follows: 

1) In both cases, we have derived non-asymptotic bounds for the error probabilities with respect to the 
number of leaf nodes N. These bounds in turn characterize the asymptotic decay rates of the error 
probabilities. 

2) Suppose that the majority dominance rule with random tie-breaking is applied throughout the tree. 
In the case where M is even, we derive the exact convergence rate of the error probability: 

$$\log_2 P_N^{-1} = \Theta\left(N^{\log_M \lfloor \frac{M+1}{2} \rfloor}\right).$$

Therefore, we show that the majority dominance rule with random tie-breaking does not achieve the
upper bound in (1). (In the case where M is odd, our asymptotic decay rate is consistent with the
result in [57].)

3) Suppose that the Bayesian likelihood-ratio test is applied. We show that the convergence rate of
the error probability is

$$\log_2 P_N^{-1} = \Omega\left(N^{\log_M \lfloor \frac{M+1}{2} \rfloor}\right).$$

Therefore, the convergence rate in this case is not worse than that in the majority dominance case.
Hence in the case where M is odd, the Bayesian likelihood-ratio test also achieves the upper bound
in (1).

4) In the case where M is even, we study an alternative majority dominance strategy, which achieves
a strictly faster convergence rate than the majority dominance rule with random tie-breaking. The
convergence rate of the total error probability using this strategy is

$$\log_2 P_N^{-1} = \Theta\left(N^{\log_M \frac{\sqrt{M(M+2)}}{2}}\right).$$

The upper bound in (1) involves an arithmetic mean of M + 2 and M. In contrast, the above rate
involves the geometric mean of M + 2 and M. Therefore, the gap between this rate and the upper
bound in (1) is small and almost negligible when M is large. We also show that the Bayesian
likelihood-ratio test achieves this convergence rate under certain conditions.
5) We propose a message-passing scheme involving non-binary message alphabets. We derive explicit 
convergence rates of the total error probabilities in the following cases: any combination of fusion 
rules, majority dominance rule with random tie-breaking, Bayesian likelihood-ratio test, and alterna- 
tive majority dominance rule. We also derive tight upper and lower bounds for the average message 
size as explicit functions of the spanning factor M. 
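
For even M, the three rates quoted in the list above can be compared numerically. The following Python sketch is only an illustration: the function name `rate_exponents` and the dictionary keys are our own labels, and the exponents are taken from the rates stated above, namely $\log_M \frac{M+1}{2}$ for the upper bound in (1), $\log_M \frac{\sqrt{M(M+2)}}{2}$ for the alternative majority dominance strategy, and $\log_M \lfloor \frac{M+1}{2} \rfloor$ for the majority dominance rule with random tie-breaking.

```python
import math


def rate_exponents(M: int) -> dict:
    """Exponents of N in log2(1/P_N) for an even spanning factor M."""
    if M < 2 or M % 2 != 0:
        raise ValueError("this comparison is meant for even M >= 2")
    return {
        # upper bound (1): half of the arithmetic mean of M and M + 2
        "upper_bound": math.log((M + 1) / 2, M),
        # alternative majority dominance: half of the geometric mean of M and M + 2
        "alternative_majority": math.log(math.sqrt(M * (M + 2)) / 2, M),
        # random tie-breaking: floor((M + 1) / 2) equals M / 2 for even M
        "random_tie_breaking": math.log(M // 2, M),
    }


if __name__ == "__main__":
    for M in (2, 4, 10, 100):
        print(M, rate_exponents(M))
```

For M = 4 the exponents are approximately 0.661, 0.646, and 0.5, respectively, and the gap between the first two shrinks as M grows.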

II. Problem Formulation 

We consider the problem of binary hypothesis testing between $H_0$ and $H_1$, with $\mathbb{P}_0$ and $\mathbb{P}_1$ as the
probability measures associated with the two hypotheses. The social network is organized as an M-ary
relay tree shown in Fig. 1, in which leaf agents (circles) are agents making independent measurements
of the underlying true hypothesis. Only these leaves have direct access to the measurements in the tree
structure. These leaf agents then make binary decisions based on their measurements and forward their
decisions (messages) to their parent agents at the next level. Each nonleaf agent, with the exception of
the root, is a relay agent (diamond), which aggregates M binary messages received from its child agents
into one new binary message and forwards it to its parent agent again. This process takes place at each
agent, culminating at the root (rectangle) where the final decision is made between the two hypotheses
based on the messages received. We denote the number of leaf agents by $N$, which also represents the
number of measurements. The height of the tree is $\log_M N$, which grows unboundedly as the number of
leaf agents goes to infinity.

We assume that the decisions at all the leaf agents are independent given each hypothesis, and that
they have identical Type I error probability (also known as false alarm probability, denoted by $\alpha_0$) and
identical Type II error probability (also known as missed detection probability, denoted by $\beta_0$). In this
paper, we answer the following questions about the Type I and Type II error probabilities:

• How do they change as we move upward in the tree?

• What are their explicit forms as functions of $N$?

• Do they converge to 0 at the root?

• If yes, how fast will they converge with respect to $N$?

For each nonleaf agent, we consider two ways of aggregating M binary messages: 



March 6, 2013 



DRAFT 




Fig. 1. An M-ary relay tree with height k. Circles represent leaf agents making direct measurements. Diamonds represent 
relay agents which fuse M binary messages. The rectangle at the root makes an overall decision. 



• In the first case, each nonleaf agent simply aggregates M binary messages into a new binary decision 
using the majority dominance rule (with random tie-breaking), which is a typical non-Bayesian fusion 
rule. This way of aggregating information is common in daily life (e.g., voting). For this fusion rule, 
we provide explicit recursions for the Type I and Type II error probabilities as we move towards the
root. We derive bounds for the Type I, Type II, and total error probabilities at the root as explicit
functions of N, which in turn characterize the convergence rates. 

• In the second case, each nonleaf agent knows the error probabilities associated with the binary 
messages received and it aggregates M binary messages into a new binary decision using the 
Bayesian likelihood-ratio test, which is locally optimal in the sense that the total error probability 
after fusion is minimized. We derive an upper bound for the total error probability, which shows 
that the convergence speed of the total error probability using this fusion rule is at least as fast as
that using the majority dominance rule. 
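
The two fusion rules described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the paper's implementation: the function names, the convention that `alpha` is the probability of a message equal to 1 under $H_0$ and `beta` the probability of a message equal to 0 under $H_1$, the default equal priors, and the choice to decide 1 when the likelihood ratio exactly equals the threshold are all ours.

```python
import random


def majority_fusion(messages, p_b=0.5, rng=random):
    """Majority dominance rule; a tie (possible only for even M) gives 1 with probability p_b."""
    ones = sum(messages)
    zeros = len(messages) - ones
    if ones > zeros:
        return 1
    if ones < zeros:
        return 0
    return 1 if rng.random() < p_b else 0


def bayesian_lrt_fusion(messages, alpha, beta, pi0=0.5, pi1=0.5):
    """Likelihood-ratio test on i.i.d. binary messages with threshold pi0 / pi1.

    alpha = P0(message = 1) and beta = P1(message = 0) are assumed known to the agent.
    """
    ones = sum(messages)
    zeros = len(messages) - ones
    # likelihood of the observed messages under H1 and under H0
    like1 = (1 - beta) ** ones * beta ** zeros
    like0 = alpha ** ones * (1 - alpha) ** zeros
    # deciding 1 on equality is an arbitrary convention of this sketch
    return 1 if pi1 * like1 >= pi0 * like0 else 0


if __name__ == "__main__":
    msgs = [1, 1, 0]
    print(majority_fusion(msgs))                   # majority of ones
    print(bayesian_lrt_fusion(msgs, 0.45, 0.001))  # a single 0 is strong evidence for H0 here
```

With strongly asymmetric error probabilities the two rules can disagree: in the usage example a single 0 outweighs two 1s under the likelihood-ratio test, because a 0 is very unlikely under $H_1$.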

III. Error Probability Bounds and Asymptotic Convergence Rates: Majority Dominance

In this section, we consider the case where each nonleaf agent uses the majority dominance rule. We 
derive explicit upper and lower bounds for the Type I, Type II, and total error probabilities with respect
to N. Then, we use these bounds to characterize the asymptotic convergence rates. 

A. Error Probability Bounds 

We divide our analysis into two cases: oddary tree (M odd) and evenary tree (M even). In each case,
we first derive the recursions for the Type I and Type II error probabilities and show that all agents at
level k have the same error probability pair $(\alpha_k, \beta_k)$. Then, we study the step-wise reduction of each
kind of error probability. From these we derive upper and lower bounds for the Type I, Type II, and the
total error probability at the root.

1) Oddary Tree: We first study the case where the degree of branching M is an odd integer. Consider
an agent at level k, which aggregates M binary messages $\mathbf{u}^{k-1} = \{u_1^{k-1}, u_2^{k-1}, \ldots, u_M^{k-1}\}$ from its child
agents at level $k-1$, where $u_t^{k-1} \in \{0, 1\}$ for all t. Suppose that $u_o^k$ is the output binary message after
fusion, which is again sent to the parent agent at the next level. The majority dominance rule, when M
is odd, is simply

$$u_o^k = \begin{cases} 1, & \text{if } \sum_{t=1}^{M} u_t^{k-1} > M/2, \\ 0, & \text{if } \sum_{t=1}^{M} u_t^{k-1} < M/2. \end{cases}$$

Suppose that the binary messages $\{u_t^{k-1}\}$ have identical Type I error probability $\alpha$ and identical
Type II error probability $\beta$. Then, the Type I and Type II error probability pair $(\alpha', \beta')$ associated with
the output binary message is given by:

$$\begin{aligned}
\alpha' = \mathbb{P}_0(u_o^k = 1) &= \prod_{t=1}^{M} \mathbb{P}_0(u_t^{k-1} = 1) + \binom{M}{1} \mathbb{P}_0(u_s^{k-1} = 0) \prod_{t=1}^{M-1} \mathbb{P}_0(u_t^{k-1} = 1) + \cdots \\
&\quad + \binom{M}{(M-1)/2} \prod_{t=1}^{(M-1)/2} \mathbb{P}_0(u_t^{k-1} = 0) \prod_{t=1}^{(M+1)/2} \mathbb{P}_0(u_t^{k-1} = 1) \\
&= f(\alpha),
\end{aligned}$$

where $f(\alpha) := \alpha^M + \binom{M}{1}\alpha^{M-1}(1-\alpha) + \cdots + \binom{M}{(M-1)/2}\alpha^{(M+1)/2}(1-\alpha)^{(M-1)/2}$, and

$$\begin{aligned}
\beta' = \mathbb{P}_1(u_o^k = 0) &= \prod_{t=1}^{M} \mathbb{P}_1(u_t^{k-1} = 0) + \binom{M}{1} \mathbb{P}_1(u_s^{k-1} = 1) \prod_{t=1}^{M-1} \mathbb{P}_1(u_t^{k-1} = 0) + \cdots \\
&\quad + \binom{M}{(M-1)/2} \prod_{t=1}^{(M-1)/2} \mathbb{P}_1(u_t^{k-1} = 1) \prod_{t=1}^{(M+1)/2} \mathbb{P}_1(u_t^{k-1} = 0) \\
&= f(\beta).
\end{aligned}$$

We assume that all the binary messages from leaf agents have the same error probability pair $(\alpha_0, \beta_0)$.
Hence, all agent decisions at level 1 will have the same error probability pair after fusion:
$(\alpha_1, \beta_1) = (f(\alpha_0), f(\beta_0))$. By induction, we have

$$(\alpha_{k+1}, \beta_{k+1}) = (f(\alpha_k), f(\beta_k)), \quad k = 0, 1, \ldots, \log_M N - 1,$$

where $(\alpha_k, \beta_k)$ represents the error probability pair for agents at the kth level of the tree. Note that the
recursions for $\alpha_k$ and $\beta_k$ are identical. Hence, it suffices to consider only the Type I error probability
in deriving the error probability bounds. Before proceeding, we provide the following lemma.
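
The odd-M recursion can be evaluated directly. The sketch below is illustrative only (the names `f_odd` and `error_by_level` are ours): it computes $f(\alpha) = \sum_{j=0}^{(M-1)/2} \binom{M}{j} \alpha^{M-j} (1-\alpha)^j$, where j counts the correct messages among the M received, and iterates it level by level.

```python
from math import comb


def f_odd(alpha: float, M: int) -> float:
    """Error probability after one majority fusion of M i.i.d. binary messages (M odd)."""
    if M < 1 or M % 2 == 0:
        raise ValueError("M must be a positive odd integer")
    # the fused message is wrong when at most (M - 1) / 2 of the M messages are correct
    return sum(comb(M, j) * alpha ** (M - j) * (1 - alpha) ** j
               for j in range((M - 1) // 2 + 1))


def error_by_level(alpha0: float, M: int, levels: int) -> list:
    """Return [alpha_0, alpha_1, ..., alpha_levels] under the recursion alpha_{k+1} = f(alpha_k)."""
    out = [alpha0]
    for _ in range(levels):
        out.append(f_odd(out[-1], M))
    return out


if __name__ == "__main__":
    for k, a in enumerate(error_by_level(0.1, 3, 4)):
        print(k, a)
```

For M = 3 the polynomial is $f(\alpha) = 3\alpha^2 - 2\alpha^3$, so a leaf error probability of 0.1 becomes 0.028 after one fusion step. The same function applies to the Type II error probability, since both follow the same recursion.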
Lemma 1: Let $h_k^M(x) = x^k + \binom{M}{1}x^{k-1}(1-x) + \cdots + \binom{M}{k}(1-x)^k$, where k and M are integers.
Suppose that $0 < k < M$. Then, $h_k^M$ is a monotone decreasing function of $x \in (0, 1)$.

Proof: We use induction in M to prove the claim. First we note that $h_0^M(x) = 1$ for all M.
Suppose that M = 2. Then, we have $h_1^2(x) = 2 - x$. Suppose that M = 3. Then, we have $h_1^3(x) = 3 - 2x$
and $h_2^3(x) = x^2 - 3x + 3$. Clearly, $h_k^M$ in these cases are monotone decreasing functions of
$x \in (0,1)$.

Now suppose that $h_k^j$ are monotone decreasing functions of $x \in (0, 1)$ for all $j = 2, \ldots, m-1$ and
$k = 1, \ldots, j-1$. We wish to show that $h_k^m$ are monotone decreasing functions of $x \in (0, 1)$ for all
$k = 1, \ldots, m-1$. We know that the binomial coefficients satisfy

$$\begin{aligned}
\binom{m}{k} &= \binom{m-1}{k-1} + \binom{m-2}{k-1} + \binom{m-2}{k} \\
&= \binom{m-1}{k-1} + \binom{m-2}{k-1} + \cdots + \binom{k}{k-1} + \binom{k}{k}.
\end{aligned}$$

We apply the above expansion for all the coefficients in $h_k^m(x)$:

$$\begin{aligned}
h_k^m(x) &= x^k + \binom{m}{1}x^{k-1}(1-x) + \cdots + \binom{m}{k}(1-x)^k \\
&= x^k + \binom{k}{1}x^{k-1}(1-x) + \cdots + \binom{k}{k}(1-x)^k \\
&\quad + (1-x)\left[x^{k-1} + \binom{m-1}{1}x^{k-2}(1-x) + \cdots + \binom{m-1}{k-1}(1-x)^{k-1}\right] + \cdots \\
&\quad + (1-x)\left[x^{k-1} + \binom{k}{1}x^{k-2}(1-x) + \cdots + \binom{k}{k-1}(1-x)^{k-1}\right] \\
&= 1 + (1-x)h_{k-1}^{m-1}(x) + \cdots + (1-x)h_{k-1}^{k}(x) \\
&= 1 + (1-x)\sum_{j=k}^{m-1} h_{k-1}^{j}(x).
\end{aligned}$$

By the induction hypothesis, $h_{k-1}^j$ are monotone decreasing for all $j = k, \ldots, m-1$. Moreover, it is easy
to see that $h_{k-1}^j$ are positive for all $j = k, \ldots, m-1$. Therefore, because the product of two positive
monotone decreasing functions is also monotone decreasing, $h_k^m$ is a monotone decreasing function of
$x \in (0, 1)$. This completes the proof. ■
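
Lemma 1 and the identity used in its proof can be spot-checked numerically. The sketch below is a sanity check under an assumption, not part of the proof: it takes $h_k^M(x) = \sum_{j=0}^{k} \binom{M}{j} x^{k-j} (1-x)^j$, which is the reading that reproduces the examples $h_1^2(x) = 2 - x$, $h_1^3(x) = 3 - 2x$, and $h_2^3(x) = x^2 - 3x + 3$ given in the proof. The function names are ours.

```python
from math import comb


def h(k: int, M: int, x: float) -> float:
    """h_k^M(x) = sum_{j=0}^{k} C(M, j) x^(k-j) (1-x)^j."""
    return sum(comb(M, j) * x ** (k - j) * (1 - x) ** j for j in range(k + 1))


def recursion_rhs(k: int, m: int, x: float) -> float:
    """Right-hand side of h_k^m(x) = 1 + (1 - x) * sum_{j=k}^{m-1} h_{k-1}^j(x)."""
    return 1 + (1 - x) * sum(h(k - 1, j, x) for j in range(k, m))


def is_decreasing_on_grid(k: int, M: int, points: int = 49) -> bool:
    """Check that h_k^M decreases along an evenly spaced grid inside (0, 1)."""
    xs = [i / (points + 1) for i in range(1, points + 1)]
    values = [h(k, M, x) for x in xs]
    return all(b < a for a, b in zip(values, values[1:]))


if __name__ == "__main__":
    for m in range(2, 8):
        for k in range(1, m):
            print(m, k, is_decreasing_on_grid(k, m))
```

A grid check of this kind cannot replace the induction argument; it only guards against transcription errors in the polynomial and in the recursion.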
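Next we will analyze the step-wise shrinkage of the Type I error probability after each fusion step.
This analysis will in turn provide upper and lower bounds for the Type I error probability at the root.

Proposition 1: Consider an M-ary relay tree, where M is an odd integer. Suppose that we apply the
majority dominance rule as the fusion rule. Then, for all k we have

$$1 \le \frac{\alpha_{k+1}}{\alpha_k^{(M+1)/2}} \le \binom{M}{(M-1)/2}.$$

Proof: Consider the ratio of $\alpha_{k+1}$ and $\alpha_k^{(M+1)/2}$:

$$\frac{\alpha_{k+1}}{\alpha_k^{(M+1)/2}} = \alpha_k^{(M-1)/2} + \binom{M}{1}\alpha_k^{(M-3)/2}(1-\alpha_k) + \cdots + \binom{M}{(M-1)/2}(1-\alpha_k)^{(M-1)/2}.$$

First, we derive the lower bound of the ratio. We know that

$$1 = \left(\alpha_k + (1-\alpha_k)\right)^{(M-1)/2} = \alpha_k^{(M-1)/2} + \binom{(M-1)/2}{1}\alpha_k^{(M-3)/2}(1-\alpha_k) + \cdots + \binom{(M-1)/2}{(M-1)/2}(1-\alpha_k)^{(M-1)/2}.$$

Moreover, it is easy to see that $\binom{M}{j} \ge \binom{(M-1)/2}{j}$ for all $j = 1, 2, \ldots, (M-1)/2$. Consequently, we
have $\alpha_{k+1}/\alpha_k^{(M+1)/2} \ge 1$. Next, we derive the upper bound of the ratio. By Lemma 1, we know that the
ratio $\alpha_{k+1}/\alpha_k^{(M+1)/2}$ is monotone increasing as $\alpha_k \to 0$. Hence, we have

$$\frac{\alpha_{k+1}}{\alpha_k^{(M+1)/2}} \le \binom{M}{(M-1)/2}.$$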

The bounds in Proposition 1 hold for all $\alpha_k \in (0,1)$. Furthermore, the upper bound is achieved at
the limit as $\alpha_k \to 0$; i.e., $\lim_{\alpha_k \to 0} \alpha_{k+1}/\alpha_k^{(M+1)/2} = \binom{M}{(M-1)/2}$. Using the above proposition, we now
derive upper and lower bounds for $\log_2 \alpha_k^{-1}$.

Theorem 1: Consider an M-ary relay tree, where M is an odd integer. Let $\lambda_M = (M + 1)/2$. Suppose
that we apply the majority dominance rule as the fusion rule. Then, for all k we have

$$\lambda_M^k \left(\log_2 \alpha_0^{-1} - \log_2 \binom{M}{\lambda_M}\right) \le \log_2 \alpha_k^{-1} \le \lambda_M^k \log_2 \alpha_0^{-1}.$$

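Proof: From the inequalities in Proposition 1 we have $\alpha_{k+1} = c_k \alpha_k^{(M+1)/2} = c_k \alpha_k^{\lambda_M}$, where
$c_k \in \left[1, \binom{M}{(M-1)/2}\right]$. From these we obtain

$$\alpha_k = c_{k-1}\, c_{k-2}^{\lambda_M} \cdots c_0^{\lambda_M^{k-1}}\, \alpha_0^{\lambda_M^k},$$

where $c_i \in \left[1, \binom{M}{(M-1)/2}\right]$ for all i, and

$$\log_2 \alpha_k^{-1} = -\log_2 c_{k-1} - \lambda_M \log_2 c_{k-2} - \cdots - \lambda_M^{k-1} \log_2 c_0 + \lambda_M^k \log_2 \alpha_0^{-1}.$$

Since $\log_2 c_i \in \left[0, \log_2 \binom{M}{(M-1)/2}\right]$, we have $\log_2 \alpha_k^{-1} \le \lambda_M^k \log_2 \alpha_0^{-1}$. Moreover, we obtain

$$\begin{aligned}
\log_2 \alpha_k^{-1} &\ge -\log_2 \binom{M}{(M-1)/2} - \lambda_M \log_2 \binom{M}{(M-1)/2} - \cdots - \lambda_M^{k-1} \log_2 \binom{M}{(M-1)/2} + \lambda_M^k \log_2 \alpha_0^{-1} \\
&= -\frac{\lambda_M^k - 1}{\lambda_M - 1} \log_2 \binom{M}{(M-1)/2} + \lambda_M^k \log_2 \alpha_0^{-1} \\
&\ge \lambda_M^k \left[\log_2 \alpha_0^{-1} - \log_2 \binom{M}{(M-1)/2}\right] \\
&= \lambda_M^k \left(\log_2 \alpha_0^{-1} - \log_2 \binom{M}{\lambda_M}\right).
\end{aligned}$$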



The bounds for $\log_2 \beta_k^{-1}$ are similar and they are omitted for brevity. Note that our result holds for
all finite integer k. In addition, our approach provides explicit bounds for both Type I and Type II error
probabilities respectively. From the above results, we immediately obtain bounds at the root simply by
substituting $k = \log_M N$ into the bounds in Theorem 1.
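
The step-wise ratio bounds and the resulting bounds on $\log_2 \alpha_k^{-1}$ for odd M can be checked numerically. The sketch below is a hedged illustration (the names `f_odd` and `check_odd_bounds` and the tolerance are ours): it iterates the majority-fusion recursion and tests, at each level k, that $1 \le \alpha_{k+1}/\alpha_k^{\lambda_M} \le \binom{M}{(M-1)/2}$ and that $\lambda_M^k(\log_2 \alpha_0^{-1} - \log_2 \binom{M}{(M-1)/2}) \le \log_2 \alpha_k^{-1} \le \lambda_M^k \log_2 \alpha_0^{-1}$ with $\lambda_M = (M+1)/2$.

```python
from math import comb, log2


def f_odd(alpha: float, M: int) -> float:
    """One majority-fusion step for odd M."""
    return sum(comb(M, j) * alpha ** (M - j) * (1 - alpha) ** j
               for j in range((M - 1) // 2 + 1))


def check_odd_bounds(alpha0: float, M: int, levels: int, tol: float = 1e-9) -> bool:
    """Return True if the ratio bounds and the log bounds hold at levels 0..levels."""
    lam = (M + 1) // 2
    c_max = comb(M, (M - 1) // 2)
    alpha = alpha0
    for k in range(levels + 1):
        lower = lam ** k * (log2(1 / alpha0) - log2(c_max))
        upper = lam ** k * log2(1 / alpha0)
        if not (lower - tol <= log2(1 / alpha) <= upper + tol):
            return False
        nxt = f_odd(alpha, M)
        ratio = nxt / alpha ** lam
        if not (1 - tol <= ratio <= c_max + tol):
            return False
        alpha = nxt
    return True


if __name__ == "__main__":
    print(check_odd_bounds(0.05, 3, 6))
    print(check_odd_bounds(0.2, 5, 4))
```

The number of levels is kept small because the error probability shrinks doubly exponentially in k and would eventually underflow in floating point.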

Corollary 1: Let $P_{F,N}$ be the Type I error probability at the root of an M-ary relay tree, where M is
an odd integer. Suppose that we apply the majority dominance rule as the fusion rule. Then, we have

$$N^{\log_M \lambda_M} \left(\log_2 \alpha_0^{-1} - \log_2 \binom{M}{\lambda_M}\right) \le \log_2 P_{F,N}^{-1} \le N^{\log_M \lambda_M} \log_2 \alpha_0^{-1}.$$



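2) Evenary Tree: We now study the case where M is an even integer and derive upper and lower
bounds for the Type I error probabilities. The majority dominance rule in this case is

$$u_o^k = \begin{cases} 1, & \text{if } \sum_{t=1}^{M} u_t^{k-1} > M/2, \\ 1 \text{ w.p. } P_b \text{ and } 0 \text{ w.p. } 1 - P_b, & \text{if } \sum_{t=1}^{M} u_t^{k-1} = M/2, \\ 0, & \text{if } \sum_{t=1}^{M} u_t^{k-1} < M/2, \end{cases}$$

where $P_b \in (0,1)$ denotes the Bernoulli parameter for tie-breaking. We first assume that the tie-breaking
is fifty-fifty; i.e., $P_b = 1/2$. We will show later that this assumption can be relaxed. The recursions for
the Type I and Type II error probabilities are as follows:

$$\begin{aligned}
\alpha_k = \mathbb{P}_0(u_o^k = 1) &= \prod_{t=1}^{M} \mathbb{P}_0(u_t^{k-1} = 1) + \binom{M}{1} \mathbb{P}_0(u_s^{k-1} = 0) \prod_{t=1}^{M-1} \mathbb{P}_0(u_t^{k-1} = 1) + \cdots \\
&\quad + \frac{1}{2}\binom{M}{M/2} \prod_{t=1}^{M/2} \mathbb{P}_0(u_t^{k-1} = 0) \prod_{t=1}^{M/2} \mathbb{P}_0(u_t^{k-1} = 1) \\
&= g(\alpha_{k-1}),
\end{aligned}$$

where $g(\alpha_{k-1}) := \alpha_{k-1}^M + \binom{M}{1}\alpha_{k-1}^{M-1}(1-\alpha_{k-1}) + \cdots + \frac{1}{2}\binom{M}{M/2}\alpha_{k-1}^{M/2}(1-\alpha_{k-1})^{M/2}$, and

$$\begin{aligned}
\beta_k = \mathbb{P}_1(u_o^k = 0) &= \prod_{t=1}^{M} \mathbb{P}_1(u_t^{k-1} = 0) + \binom{M}{1} \mathbb{P}_1(u_s^{k-1} = 1) \prod_{t=1}^{M-1} \mathbb{P}_1(u_t^{k-1} = 0) + \cdots \\
&\quad + \frac{1}{2}\binom{M}{M/2} \prod_{t=1}^{M/2} \mathbb{P}_1(u_t^{k-1} = 1) \prod_{t=1}^{M/2} \mathbb{P}_1(u_t^{k-1} = 0) \\
&= g(\beta_{k-1}).
\end{aligned}$$

Next we study the step-wise reduction of each type of error probability when each nonleaf agent uses
the majority dominance rule. Again it suffices to consider $\alpha_k$ since the recursions are the same.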
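
The even-M recursion differs from the odd case only in the tie term. The sketch below is an illustrative implementation under stated assumptions (the name `g_even` and the parameter `p_tie` are ours): `p_tie` is the probability that a tie is resolved incorrectly, which equals $P_b$ for the Type I error and $1 - P_b$ for the Type II error, and the default of 1/2 corresponds to fifty-fifty tie-breaking.

```python
from math import comb


def g_even(alpha: float, M: int, p_tie: float = 0.5) -> float:
    """Error probability after one majority fusion of M i.i.d. binary messages (M even)."""
    if M < 2 or M % 2 != 0:
        raise ValueError("M must be a positive even integer")
    half = M // 2
    # strict majority of wrong messages: fewer than M / 2 messages are correct
    strict = sum(comb(M, j) * alpha ** (M - j) * (1 - alpha) ** j
                 for j in range(half))
    # exact tie: M / 2 wrong and M / 2 correct, resolved wrongly with probability p_tie
    tie = p_tie * comb(M, half) * (alpha * (1 - alpha)) ** half
    return strict + tie


if __name__ == "__main__":
    alpha = 0.1
    for k in range(5):
        print(k, alpha)
        alpha = g_even(alpha, 4)
```

With fifty-fifty tie-breaking, M = 2 leaves the error probability unchanged, and M = 4 gives $g(\alpha) = 3\alpha^2 - 2\alpha^3$.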

Proposition 2: Consider an M-ary relay tree, where M is an even integer. Suppose that we apply the
majority dominance rule as the fusion rule. Then, for all k we have

$$1 \le \frac{\alpha_{k+1}}{\alpha_k^{M/2}} \le \frac{1}{2}\binom{M}{M/2}.$$

The proof is given in Appendix A. The upper bound is achieved at the limit as $\alpha_k \to 0$; i.e.,
$\lim_{\alpha_k \to 0} \alpha_{k+1}/\alpha_k^{M/2} = \binom{M}{M/2}/2$.
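
As a quick check of Proposition 2, consider M = 4 with fifty-fifty tie-breaking. The recursion then reduces to

$$\alpha_{k+1} = \alpha_k^4 + 4\alpha_k^3(1-\alpha_k) + \frac{1}{2}\binom{4}{2}\alpha_k^2(1-\alpha_k)^2 = 3\alpha_k^2 - 2\alpha_k^3,$$

so that

$$\frac{\alpha_{k+1}}{\alpha_k^{2}} = 3 - 2\alpha_k \in (1, 3) \quad \text{for all } \alpha_k \in (0,1).$$

The ratio therefore lies between 1 and $\frac{1}{2}\binom{4}{2} = 3$, and it tends to 3 as $\alpha_k \to 0$.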

In deriving the above results, we assumed that the tie-breaking rule uses $P_b = 1/2$. Suppose now that
the tie is broken with Bernoulli distribution with some arbitrary probability $P_b \in (0, 1)$. Then, it is easy
to show that

- M/2 - ^ ■ 

The bounds above are not as tight as those in Proposition 2. However, the asymptotic convergence rates
remain the same as we shall see later. 

Next we derive upper and lower bounds for the Type I error probability at each level k. 

Theorem 2: Consider an M-ary relay tree, where M is an even integer. Let $\lambda_M = M/2$. Suppose that
we apply the majority dominance rule as the fusion rule. Then, for all k we have

$$\lambda_M^k \left(\log_2 \alpha_0^{-1} - \log_2 \binom{M}{\lambda_M}\right) \le \log_2 \alpha_k^{-1} \le \lambda_M^k \log_2 \alpha_0^{-1}.$$



The proof is given in Appendix B. Similar to the oddary tree case, we can provide upper and lower
bounds for the Type I error probability at the root.

Corollary 2: Let $P_{F,N}$ be the Type I error probability at the root of an M-ary relay tree, where M is
an even integer. Suppose that we apply the majority dominance rule as the fusion rule. Then, we have

$$N^{\log_M \lambda_M} \left(\log_2 \alpha_0^{-1} - \log_2 \binom{M}{\lambda_M}\right) \le \log_2 P_{F,N}^{-1} \le N^{\log_M \lambda_M} \log_2 \alpha_0^{-1}.$$
Remark 1: Notice that the above result is only useful when $M \ge 4$. For the case where M = 2
(balanced binary relay trees), we have $\alpha_{k+1} = \alpha_k^2 + \alpha_k(1 - \alpha_k) = \alpha_k$ and $\beta_{k+1} = \beta_k^2 + \beta_k(1 - \beta_k) = \beta_k$;
that is, the Type I and Type II error probabilities remain the same after fusing with the majority dominance
rule.

Remark 2: We have provided a detailed analysis in [56] of the convergence rate of the total error
probability in balanced binary relay trees (M = 2) using the unit-threshold likelihood-ratio test at every
nonleaf agent. We show explicit upper and lower bounds for the total error probability at the root as
a function of the number N of leaf agents, which in turn characterizes the convergence rate $\sqrt{N}$. Moreover,
we show that the unit-threshold likelihood-ratio test, which is locally optimal, is close to globally optimal
in terms of the reduction in the total error probability (see [61] for details).

Remark 3: Notice that the bounds in Corollaries 1 and 2 have the same form. Therefore, the odd and
even cases can be unified if we simply let $\lambda_M = \lfloor (M + 1)/2 \rfloor$.

In the next section, we use the bounds above to derive upper and lower bounds for the total error 
probability at the root in the majority dominance rule case. 

3) Total Error Probability Bounds: In this section, we provide upper and lower bounds for the total
error probability $P_N$ at the root. Let $\pi_0$ and $\pi_1$ be the prior probabilities for the two underlying hypotheses.
It is easy to see that $P_N = \pi_0 P_{F,N} + \pi_1 P_{M,N}$, where $P_{F,N}$ and $P_{M,N}$ correspond to the Type I and Type
II error probabilities at the root. With the bounds for each type of error probability in the case where the
majority dominance rule is used, we provide bounds for the total error probability as follows.

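Theorem 3: Consider an M-ary relay tree, and let $\lambda_M = \lfloor (M + 1)/2 \rfloor$. Suppose that we apply the majority
dominance rule as the fusion rule. Then, we have

$$N^{\log_M \lambda_M} \left(\log_2 \max\{\alpha_0, \beta_0\}^{-1} - \log_2 \binom{M}{\lambda_M}\right) \le \log_2 P_N^{-1} \le N^{\log_M \lambda_M} \left(\pi_0 \log_2 \alpha_0^{-1} + \pi_1 \log_2 \beta_0^{-1}\right).$$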

Proof: From the definition of $P_N$; that is, $P_N = \pi_0 P_{F,N} + \pi_1 P_{M,N}$, we have $P_N \le \max\{P_{F,N}, P_{M,N}\}$.
In addition, we know that $\alpha_k$ and $\beta_k$ have the same recursion. Therefore, the maximum between the
Type I and Type II error probabilities at the root corresponds to the maximum at the leaf agents. Hence,
we have $N^{\log_M \lambda_M} \left(\log_2 \max\{\alpha_0, \beta_0\}^{-1} - \log_2 \binom{M}{\lambda_M}\right) \le \log_2 P_N^{-1}$.
By the fact that $\log_2 x^{-1}$ is a convex function, we have $\log_2 P_N^{-1} \le \pi_0 \log_2 P_{F,N}^{-1} + \pi_1 \log_2 P_{M,N}^{-1}$.
Therefore, we have $\log_2 P_N^{-1} \le N^{\log_M \lambda_M} \left(\pi_0 \log_2 \alpha_0^{-1} + \pi_1 \log_2 \beta_0^{-1}\right)$. ■

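These non-asymptotic results are useful. For example, if we want to know how many measurements
are required such that $P_N \le \epsilon$, the answer is simply to find the smallest N that satisfies the inequality
in Theorem 3; i.e.,

$$N^{\log_M \lambda_M} \left(\log_2 \max\{\alpha_0, \beta_0\}^{-1} - \log_2 \binom{M}{\lambda_M}\right) \ge \log_2 \epsilon^{-1}.$$

Hence we have

$$N \ge \left(\frac{\log_2 \epsilon^{-1}}{\log_2 \max\{\alpha_0, \beta_0\}^{-1} - \log_2 \binom{M}{\lambda_M}}\right)^{\log_{\lambda_M} M}.$$

The growth rate for the number of measurements is $\Theta\left((\log_2 \epsilon^{-1})^{\log_{\lambda_M} M}\right)$.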
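
The sample-size calculation can be sketched as follows. This is a hedged illustration, not a definitive recipe: the function name `leaves_needed` is ours, the constant inside the logarithm is taken to be $\binom{M}{\lambda_M}$ with $\lambda_M = \lfloor (M+1)/2 \rfloor$ (an assumption of this sketch), and the result is a sufficient number of leaves derived from the lower bound on $\log_2 P_N^{-1}$, rounded up to the leaf count of a complete M-ary tree.

```python
from math import ceil, comb, log, log2


def leaves_needed(eps: float, alpha0: float, beta0: float, M: int):
    """Return (n_min, n_full): the real-valued sufficient N and the next power of M."""
    if not 0 < eps < 1:
        raise ValueError("eps must lie in (0, 1)")
    lam = (M + 1) // 2
    if lam < 2:
        raise ValueError("need M >= 3; for M = 2 the majority rule does not reduce the error")
    # bracket of the lower bound; it must be positive for the bound to be informative
    margin = log2(1 / max(alpha0, beta0)) - log2(comb(M, lam))
    if margin <= 0:
        raise ValueError("leaf error probabilities are too large for this bound")
    # N >= (log2(1/eps) / margin) ** log_lam(M)
    n_min = (log2(1 / eps) / margin) ** (log(M) / log(lam))
    # leaves of a complete M-ary tree come in powers of M, so round the height up
    height = max(0, ceil(log(n_min) / log(M)))
    return n_min, M ** height


if __name__ == "__main__":
    print(leaves_needed(1e-6, 0.01, 0.01, 3))
```

For example, with M = 3, leaf error probabilities of 0.01, and a target of $10^{-6}$, the bound asks for slightly fewer than 9 leaves, so a tree of height 2 with N = 9 leaves suffices.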

B. Asymptotic Convergence Rates 

In this section, we study the convergence rates of error probabilities in the asymptotic regime as
$N \to \infty$. We use the following notation to characterize the scaling law of the asymptotic decay rate.
Let j and h be positive functions defined on positive integers. We write $j(N) = O(h(N))$ if there exists
a positive constant $c_1$ such that $j(N) \le c_1 h(N)$ for sufficiently large $N$. We write $j(N) = \Omega(h(N))$
if there exists a positive constant $c_2$ such that $j(N) \ge c_2 h(N)$ for sufficiently large N. We write
$j(N) = \Theta(h(N))$ if $j(N) = O(h(N))$ and $j(N) = \Omega(h(N))$.

From Corollaries 1 and 2 we can easily derive the decay rates of the Type I and Type II error 
probabilities. For example, for the Type I error probability, we have the following. 

Proposition 3: Consider an M-ary relay tree, and let $\lambda_M = \lfloor (M+1)/2 \rfloor$. Suppose that we apply the 
majority dominance rule as the fusion rule. Then, we have $\log_2 P_{F,N}^{-1} = \Theta(N^{\log_M \lambda_M})$. 

Proof: To analyze the asymptotic rate, we may assume that $\alpha_0$ is sufficiently small. More specifically, 
we assume that $\alpha_0$ is small enough for the lower bounds in Corollaries 1 and 2 to be positive. In this case, 
the bounds in Corollaries 1 and 2 show that $\log_2 P_{F,N}^{-1} = \Theta(N^{\log_M \lambda_M})$. ■ 

Remark 4: Note that, for M of fixed parity, $\log_M \lambda_M$ is monotone increasing with respect to M. Moreover, 
as M goes to infinity, the limit of $\log_M \lambda_M$ is 1. That is to say, when M is very large, the decay is close to 
exponential, which is the rate for star configuration and bounded-height trees. In terms of tree structures, 
when M is very large, the tree becomes short, and therefore achieves similar performance to that of 
bounded-height trees. 

Remark 5: From the fact that the Type I and Type II error probabilities follow the same recursion, it 
is easy to see that the Type II error probability at the root also decays to 0 with exponent $N^{\log_M \lambda_M}$. 
Next, we compute the decay rate of the total error probability. 

Corollary 3: Consider an M-ary relay tree, and let $\lambda_M = \lfloor (M+1)/2 \rfloor$. Suppose that we apply the majority 
dominance rule as the fusion rule. Then, we have $\log_2 P_N^{-1} = \Theta(N^{\log_M \lambda_M})$. 

For the total error probability at the root, the argument is similar to that for the individual error 
probabilities. For large M, the decay of the total error probability is close to exponential. 

IV. Error Probability Bounds and Asymptotic Convergence Rates: Bayesian Likelihood-Ratio Test 

In this section, we consider the case where the Bayesian likelihood-ratio test is used as the fusion rule. 
We derive an upper bound for the total error probability, which in turn characterizes the convergence 
rate. We show that the convergence rate in this case is at least as fast as that with the majority 
dominance rule. 

Theorem 4: Let $P_N$ be the total error probability at the root in the case where the Bayesian 
likelihood-ratio test is used as the fusion rule in M-ary relay trees. We have 

$$\log_2 P_N^{-1} \ge N^{\log_M \lambda_M}\left(\log_2 L_0^{-1} - \log_2\left(\frac{2\binom{M}{\lambda_M}\max(\pi_0, \pi_1)}{\min(\pi_0, \pi_1)^{\lambda_M}}\right)\right).$$

Proof: In the case where the majority dominance rule is used, from Propositions 1 and 2 it is easy 
to show that 



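Since $x^{\lambda_M}$ is a convex function for all $M \ge 2$, we have 

$$\left(\frac{\alpha_k + \beta_k}{2}\right)^{\lambda_M} \le \frac{\alpha_k^{\lambda_M} + \beta_k^{\lambda_M}}{2},$$

which implies the following: 

$$2^{-\lambda_M + 1} \le \frac{\alpha_k^{\lambda_M} + \beta_k^{\lambda_M}}{(\alpha_k + \beta_k)^{\lambda_M}} \le 1.$$

Hence, we obtain 

$$2^{-\lambda_M} \le \frac{\alpha_{k+1} + \beta_{k+1}}{(\alpha_k + \beta_k)^{\lambda_M}} \le 2\binom{M}{\lambda_M}.$$

From these bounds and the fact that $\min(\pi_0, \pi_1)(\alpha_k + \beta_k) \le \pi_0\alpha_k + \pi_1\beta_k \le \max(\pi_0, \pi_1)(\alpha_k + \beta_k)$, we 
have 

$$\frac{2^{-\lambda_M}\min(\pi_0, \pi_1)}{\max(\pi_0, \pi_1)^{\lambda_M}} \le \frac{\pi_0\alpha_{k+1} + \pi_1\beta_{k+1}}{(\pi_0\alpha_k + \pi_1\beta_k)^{\lambda_M}} \le \frac{2\binom{M}{\lambda_M}\max(\pi_0, \pi_1)}{\min(\pi_0, \pi_1)^{\lambda_M}}.$$

Note that $\pi_0\alpha_k + \pi_1\beta_k$ is the total error probability for agents at level k and we denote it by $L_k$. 
The Bayesian likelihood-ratio test is the optimal rule in the sense that the total error probability is 
minimized after fusion. Let $L_{k+1}^{\mathrm{LRT}}$ be the total error probability after fusing with the Bayesian 
likelihood-ratio test. We have 

$$\frac{L_{k+1}^{\mathrm{LRT}}}{L_k^{\lambda_M}} \le \frac{L_{k+1}}{L_k^{\lambda_M}} \le \frac{2\binom{M}{\lambda_M}\max(\pi_0, \pi_1)}{\min(\pi_0, \pi_1)^{\lambda_M}}.$$

Using a similar approach as that used in proving Theorem 1, we can derive the following lower bound 
for $\log_2 P_N^{-1}$: 

$$\log_2 P_N^{-1} \ge N^{\log_M \lambda_M}\left(\log_2 L_0^{-1} - \log_2\left(\frac{2\binom{M}{\lambda_M}\max(\pi_0, \pi_1)}{\min(\pi_0, \pi_1)^{\lambda_M}}\right)\right).$$

■ 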

From the above bound, we immediately obtain the following. 

Corollary 4: Consider an M-ary relay tree, and let $\lambda_M = \lfloor (M+1)/2 \rfloor$. Suppose that we apply the 
Bayesian likelihood-ratio test as the fusion rule. Then, we have $\log_2 P_N^{-1} = \Omega(N^{\log_M \lambda_M})$. 

Note that in the case where the majority dominance rule is used, the convergence rate is exactly 
$\Theta(N^{\log_M \lambda_M})$. Therefore, the convergence rate for the Bayesian likelihood-ratio test is at least as good 
as that for the majority dominance rule. 

V. Asymptotic Optimality of Fusion Rules 

In this section, we discuss the asymptotic optimality of the two fusion rules considered in our paper 
by comparing our asymptotic convergence rates with those in [57], in which it is shown that with any 
combination of fusion rules, the convergence rate is upper bounded as 

$$\log_2 P_N^{-1} = O\left(N^{\log_M \frac{M+1}{2}}\right). \qquad (2)$$

A. Oddary case 

In the oddary tree case, if each nonleaf agent uses the majority dominance rule, then the upper bound 
in (2) is achieved; i.e., 

$$\log_2 P_N^{-1} = \Theta\left(N^{\log_M \lfloor (M+1)/2 \rfloor}\right) = \Theta\left(N^{\log_M \frac{M+1}{2}}\right).$$

This result is also mentioned in [57]. Tay et al. [51] find a similar result in bounded-height trees; that is, 
if the degree of branching for all the agents except those at level 1 is an odd constant, then the majority 
dominance rule achieves the optimal exponent. 

Now we consider the case where each nonleaf agent uses the Bayesian likelihood-ratio test. Since the 
convergence rate for this fusion rule is at least as good as that for the majority dominance rule, it is evident 
that the Bayesian likelihood-ratio test, which is only locally optimal (the total error probability after each 
fusion is minimized), achieves the globally optimal convergence rate. This result is also of interest in 
decentralized detection problems, in which the objective is usually to find the globally optimal strategy. 
In oddary trees, the myopically optimal Bayesian likelihood-ratio test, which is relevant to social learning 
problems because of the selfishness of agents, is essentially globally optimal in terms of achieving the 
optimal exponent. 

Remark 6: Suppose that each nonleaf agent uses the Bayesian likelihood-ratio test and we assume 
that the two hypotheses are equally likely. In this case, the output message is given by the unit-threshold 
likelihood-ratio test: 

$$\frac{\prod_{t=1}^{M} P_1(u_t^{k-1})}{\prod_{t=1}^{M} P_0(u_t^{k-1})} \underset{H_0}{\overset{H_1}{\gtrless}} 1.$$

If the Type I and Type II error probabilities at level 0 are equal; i.e., $\alpha_0 = \beta_0$, then the unit-threshold 
likelihood-ratio test reduces to the majority dominance rule. The bounds for the error probabilities in this 
case and those in the majority dominance rule case are identical. 
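The reduction in Remark 6 can be illustrated by brute force. The sketch below is ours, not the paper's: it assumes odd M (so that no ties occur), conditionally independent binary inputs with $\alpha = P(u_t = 1 \mid H_0)$ and $\beta = P(u_t = 0 \mid H_1)$, and error probabilities below 1/2. It enumerates all input vectors and compares the unit-threshold likelihood-ratio decision with the majority decision.

```python
from itertools import product

def lrt_decision(u, alpha, beta):
    """Unit-threshold likelihood-ratio test on a tuple u of binary messages.
    alpha = P(u_t = 1 | H0), beta = P(u_t = 0 | H1) (assumed model)."""
    p1 = p0 = 1.0
    for bit in u:
        p1 *= (1 - beta) if bit else beta      # likelihood under H1
        p0 *= alpha if bit else (1 - alpha)    # likelihood under H0
    return int(p1 > p0)

def majority_decision(u):
    """Decide 1 if strictly more than half of the inputs are 1."""
    return int(2 * sum(u) > len(u))

def rules_agree(M, alpha, beta):
    """True if both rules give the same output on every input vector."""
    return all(lrt_decision(u, alpha, beta) == majority_decision(u)
               for u in product((0, 1), repeat=M))

if __name__ == "__main__":
    print(rules_agree(5, 0.2, 0.2))   # equal error probabilities
    print(rules_agree(3, 0.01, 0.4))  # unequal error probabilities
```

With equal error probabilities the two rules coincide on every input; with strongly unequal ones (the second call) they can differ, which is why the remark requires $\alpha_0 = \beta_0$.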

B. Evenary case 

In the evenary tree case, our results show that with the majority dominance rule, we have 

$$\log_2 P_N^{-1} = \Theta\left(N^{\log_M \lfloor (M+1)/2 \rfloor}\right) = \Theta\left(N^{\log_M \frac{M}{2}}\right). \qquad (3)$$

This characterizes the explicit convergence rate of the total error probability (cf. [57], in which there is 
a gap between the upper and lower bounds for $\log_2 P_N^{-1}$). It is evident that the majority dominance rule 
in this evenary tree case does not achieve the upper bound in (2). However, the gap between the rates 
described in (2) and (3) becomes smaller and more negligible as the degree M of branching grows. 

In the case of binary relay trees (M = 2), the gap is most significant because the total error probability 
does not change after fusion with the majority dominance rule. In contrast, we have shown in [56] that 
the likelihood-ratio test achieves convergence rate $\sqrt{N}$. For $M \ge 4$, we have shown that the convergence 
rate using the Bayesian likelihood-ratio test is at least as good as that using the majority dominance rule. 

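Now we consider the case where the alternative majority dominance strategy (the tie is broken alternately 
for agents at consecutive levels) is used throughout the tree. In this case we have 

$$\alpha_k = \alpha_{k-1}^{M} + \binom{M}{M-1}\alpha_{k-1}^{M-1}(1 - \alpha_{k-1}) + \cdots + \binom{M}{M/2}\alpha_{k-1}^{M/2}(1 - \alpha_{k-1})^{M/2}$$

and 

$$\alpha_{k+1} = \alpha_k^{M} + \binom{M}{M-1}\alpha_k^{M-1}(1 - \alpha_k) + \cdots + \binom{M}{M/2+1}\alpha_k^{M/2+1}(1 - \alpha_k)^{M/2-1}.$$

Using Lemma 1, it is easy to show that 

$$1 \le \frac{\alpha_{k+1}}{\alpha_k^{M/2+1}} \le \binom{M}{M/2-1}. \qquad (4)$$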



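Theorem 5: Consider an M-ary relay tree, where M is an even integer, and let $\lambda_M = M/2$. Suppose 
that we apply the alternative majority dominance strategy. Then, for even k we have 

$$\lambda_M^{k/2}(\lambda_M + 1)^{k/2}\left(\log_2 \alpha_0^{-1} - \log_2\binom{M}{M/2}\right) \le \log_2 \alpha_k^{-1} \le \lambda_M^{k/2}(\lambda_M + 1)^{k/2}\log_2 \alpha_0^{-1}.$$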



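Proof: The case where M = 2 is easy to show using the recursion for $\alpha_k$ and the proof is omitted. 
Now let us consider the case where $M \ge 4$. From the inequalities in (4), we have 

$$\alpha_k = c_{k-1}\alpha_{k-1}^{\lambda_M}, \qquad \alpha_{k+1} = c_k\alpha_k^{\lambda_M + 1},$$

where $c_{k-1}$ and $c_k \in \left[1, \binom{M}{M/2}\right]$. From these we obtain 

$$\alpha_k = c_{k-1}\, c_{k-2}^{\lambda_M}\, c_{k-3}^{\lambda_M(\lambda_M+1)} \cdots c_0^{\lambda_M^{k/2}(\lambda_M+1)^{k/2-1}}\, \alpha_0^{\lambda_M^{k/2}(\lambda_M+1)^{k/2}},$$

where $c_i \in \left[1, \binom{M}{M/2}\right]$ for all i. Therefore, 

$$\log_2 \alpha_k^{-1} = -\log_2 c_{k-1} - \cdots - \lambda_M^{k/2}(\lambda_M+1)^{k/2-1}\log_2 c_0 + \lambda_M^{k/2}(\lambda_M+1)^{k/2}\log_2 \alpha_0^{-1}.$$

Since $\log_2 c_i \in \left[0, \log_2\binom{M}{M/2}\right]$, we have $\log_2 \alpha_k^{-1} \le \lambda_M^{k/2}(\lambda_M+1)^{k/2}\log_2 \alpha_0^{-1}$. Moreover, we have 
$\log_2 c_i \le \log_2\binom{M}{M/2}$. Hence, 

$$\log_2 \alpha_k^{-1} \ge -\log_2\binom{M}{M/2}\left(1 + \lambda_M + \lambda_M(\lambda_M+1) + \cdots + \lambda_M^{k/2}(\lambda_M+1)^{k/2-1}\right) + \lambda_M^{k/2}(\lambda_M+1)^{k/2}\log_2 \alpha_0^{-1}. \qquad (5)$$

Next we use induction to show that 

$$1 + \lambda_M + \lambda_M(\lambda_M+1) + \cdots + \lambda_M^{k/2}(\lambda_M+1)^{k/2-1} \le \lambda_M^{k/2}(\lambda_M+1)^{k/2}. \qquad (6)$$

Suppose that k = 2. Then, we have $1 + \lambda_M \le \lambda_M(\lambda_M + 1)$, which holds because $\lambda_M \ge 2$. Suppose that 
(6) holds when $k = k_0$. We wish to show that it also holds when $k = k_0 + 2$, in which case we have 

$$1 + \lambda_M + \cdots + \lambda_M^{k_0/2}(\lambda_M+1)^{k_0/2-1} + \lambda_M^{k_0/2}(\lambda_M+1)^{k_0/2} + \lambda_M^{k_0/2+1}(\lambda_M+1)^{k_0/2}$$
$$\le 2\lambda_M^{k_0/2}(\lambda_M+1)^{k_0/2} + \lambda_M^{k_0/2+1}(\lambda_M+1)^{k_0/2}$$
$$\le 2\lambda_M^{k_0/2+1}(\lambda_M+1)^{k_0/2} \le \lambda_M^{k_0/2+1}(\lambda_M+1)^{k_0/2+1}.$$

Therefore, we have proved (6). Substituting this result in (5), we obtain the desired lower bound. ■ 
The bounds for $\log_2 \beta_k^{-1}$ are similar and they are omitted for brevity. 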

Corollary 5: Let $P_{F,N}$ be the Type I error probability at the root of an M-ary relay tree, where M is 
an even integer. Suppose that we apply the alternative majority dominance strategy. Then, we have 

$$N^{\log_M \frac{\sqrt{M(M+2)}}{2}}\left(\log_2 \alpha_0^{-1} - \log_2\binom{M}{M/2}\right) \le \log_2 P_{F,N}^{-1} \le N^{\log_M \frac{\sqrt{M(M+2)}}{2}}\log_2 \alpha_0^{-1}.$$

Corollary 6: Let $P_N$ be the total error probability at the root of an M-ary relay tree, where M is 
an even integer. Suppose that we apply the alternative majority dominance strategy. Then, we have 
$\log_2 P_{M,N}^{-1} = \Theta\left(N^{\log_M \frac{\sqrt{M(M+2)}}{2}}\right)$ and $\log_2 P_N^{-1} = \Theta\left(N^{\log_M \frac{\sqrt{M(M+2)}}{2}}\right)$. 
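The two-sided bound of Theorem 5 can be checked numerically for small even M. The sketch below is illustrative and rests on assumptions: fusion inputs are treated as i.i.d., ties are resolved against a false alarm at odd levels and in favor of one at even levels (matching the order of exponents $\lambda_M + 1$, $\lambda_M$ in the proof above), and all names are ours.

```python
from math import comb, log2

def tail(p, M, j_min):
    """P(Binomial(M, p) >= j_min)."""
    return sum(comb(M, j) * p**j * (1 - p)**(M - j)
               for j in range(j_min, M + 1))

def alternating_alpha(alpha0, M, k):
    """Type I error probability after k levels (M even, k even) when the
    tie-breaking direction alternates from level to level (assumed order)."""
    lam = M // 2
    alpha = alpha0
    for level in range(1, k + 1):
        # Odd levels need lam + 1 ones for a false alarm; even levels need lam.
        j_min = lam + 1 if level % 2 == 1 else lam
        alpha = tail(alpha, M, j_min)
    return alpha

def theorem5_bounds(alpha0, M, k):
    """Lower and upper bounds on log2(1/alpha_k) from Theorem 5."""
    lam = M // 2
    scale = lam**(k // 2) * (lam + 1)**(k // 2)
    lower = scale * (log2(1 / alpha0) - log2(comb(M, M // 2)))
    upper = scale * log2(1 / alpha0)
    return lower, upper

if __name__ == "__main__":
    for k in (2, 4):
        lo, hi = theorem5_bounds(0.05, 4, k)
        print(k, lo, log2(1 / alternating_alpha(0.05, 4, k)), hi)
```

For M = 4 and $\alpha_0 = 0.05$, the computed exponent lies between the two bounds for both k = 2 and k = 4.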

Note that when M = 2, $\log_2 P_N^{-1} = \Theta(\sqrt{N})$. Therefore, the decay rate with this strategy is identical 
to that using the Bayesian likelihood-ratio test. This is not surprising because we show in [56] that 
the Bayesian likelihood-ratio test is essentially either the 'AND' rule or the 'OR' rule, depending on the values 
of the Type I and II error probabilities. We also show that the same rule will repeat no more than two 
consecutive times. Therefore, the decay rate in this case is the same as that using the alternative majority 
dominance strategy. 

For the case where $M \ge 4$, suppose that $\alpha_0$ and $\beta_0$ are sufficiently small and their difference is also 
sufficiently small. Then, it is easy to show that the Bayesian likelihood-ratio test is the majority dominance 
rule with tie-breaking given by the values of the Type I and II error probabilities. Moreover, we can 
show that the same tie-breaking will repeat no more than two consecutive times. In this case, the error 
probability decays as $\Theta\left(N^{\log_M \frac{\sqrt{M(M+2)}}{2}}\right)$. 

Recall that the upper bound for the decay rate of the total error probability with all combinations of 
fusion rules is $O\left(N^{\log_M \frac{M+1}{2}}\right)$, which involves the arithmetic mean of M + 2 and M. In contrast, the 
decay rate using the alternative majority dominance strategy and the Bayesian likelihood-ratio test involves 
the geometric mean of M + 2 and M, which means that these two strategies are almost asymptotically 
optimal, especially when M is large. 

In addition, the rate using the alternative majority dominance strategy is better than that in the random 
tie-breaking case. For illustration purposes, in Fig. 2 we plot the exponent for the decay rate of the total 
error probability versus the spanning factor M in these two cases. For comparison purposes, we also 
plot the exponent in the upper bound (2). We can see from Fig. 2 that the alternative majority dominance 
strategy achieves a larger exponent than that of the majority dominance rule with random tie-breaking. 
Moreover, the gap between the exponents in the alternative majority dominance strategy case and the 
upper bound (2) is small and almost negligible. 

Fig. 2. Plot of error exponents versus the spanning factor M. Dashed (red) line represents the alternative majority dominance 
strategy. Dotted (blue) line represents the majority dominance rule with random tie-breaking. Solid (black) line represents the 
exponent in (2). 
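The three exponents compared in Fig. 2 are simple closed-form expressions in M and can be tabulated directly. The following sketch (function name ours) evaluates, for even M, the exponent in (3) for random tie-breaking, the exponent from Corollary 6 for the alternative majority dominance strategy, and the exponent in the upper bound (2).

```python
from math import log, sqrt

def decay_exponents(M):
    """Exponents of N in log2(1/P_N) for an even branching factor M."""
    random_tie = log(M / 2, M)                    # eq. (3)
    alternating = log(sqrt(M * (M + 2)) / 2, M)   # Corollary 6
    upper_bound = log((M + 1) / 2, M)             # eq. (2)
    return random_tie, alternating, upper_bound

if __name__ == "__main__":
    for M in (2, 4, 10, 20, 100):
        print(M, decay_exponents(M))
```

Since the geometric mean of M and M + 2 lies strictly between M and their arithmetic mean M + 1, the middle exponent always sits between the other two, and all three approach 1 as M grows.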



VI. Non-binary Message Alphabets 

In the previous sections, each agent in the tree is only allowed to pass a binary message to its supervising 
agent at the next level. A natural question is, what if each agent can transmit a 'richer' message? In 
this section, we provide a message-passing scheme that allows a general message alphabet of size $\mathcal{D}$ 
(non-binary). We call this M-ary relay tree with message alphabet size $\mathcal{D}$ an $(M, \mathcal{D})$-tree. We have studied 
the convergence rates of (M, 2)-trees by investigating how fast the total error probability decays to 0. 
What about the convergence rate when $\mathcal{D}$ is an arbitrary finite integer? 

We denote by $u_o^k$ the output message for each agent at the kth level after fusing M input messages 
$u^{k-1} = \{u_1^{k-1}, u_2^{k-1}, \ldots, u_M^{k-1}\}$ from its child agents at the (k - 1)th level, where $u_t^{k-1} \in \{0, 1, \ldots, \mathcal{D} - 1\}$ 
for all $t \in \{1, 2, \ldots, M\}$. 

Case I: First, we consider an $(M, \mathcal{D})$-tree with height $k_0$, in which there are $M^{k_0}$ leaf agents, and the 
message alphabet size is sufficiently large; more precisely, 

$$\mathcal{D} \ge M^{k_0 - 1} + 1. \qquad (7)$$

For our analysis, we need the following terminology: 
Definition: Given a nonleaf agent in the tree, a subtree leaf of this agent is any leaf agent of the subtree 
rooted at the agent. An affirmative subtree leaf is any subtree leaf that sends a message of '1' upward. 

Suppose that each leaf agent still generates a binary message $u_o^0 \in \{0, 1\}$ and sends it upward to 
its parent agent. Moreover, each intermediate agent simply sums up the messages it receives from its 
immediate child agents and sends the summation to its parent agent; that is, $u_o^k = \sum_{t=1}^{M} u_t^{k-1}$. Then we 
can show that the output message for each agent at the kth level is an integer from $\{0, 1, \ldots, M^k\}$ for 
all $k \in \{0, 1, \ldots, k_0 - 1\}$. Moreover, this message essentially represents the number of its affirmative 
subtree leaves. 

Because of inequality (7), at each level k in the tree, the message alphabet size $\mathcal{D}$ is large enough to 
represent all possible values of $u_o^k$ ($k \in \{0, \ldots, k_0 - 1\}$). In particular, the root (at level $k_0$) knows the 
number of its affirmative subtree leaves. In this case, the convergence rate is the same as that of the star 
configuration, where each leaf agent sends a binary message to the root directly. Recall that in the star 
configurations, the total error probability decays exponentially fast to 0. 
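The sum-and-forward scheme of Case I is easy to simulate. The sketch below is a minimal illustration with hypothetical function names; it assumes a complete M-ary tree whose leaf messages are given as a flat list of $M^{k_0}$ bits, with each consecutive group of M entries sharing a parent.

```python
import random

def sum_and_forward(leaf_bits, M):
    """Propagate messages up a complete M-ary tree: every relay outputs the
    sum of its M children's messages. Returns the root message and the
    largest message value observed at each level (level 0 = leaves)."""
    level = list(leaf_bits)
    largest = [max(level)]
    while len(level) > 1:
        level = [sum(level[i:i + M]) for i in range(0, len(level), M)]
        largest.append(max(level))
    return level[0], largest

def required_alphabet_size(M, k0):
    """Smallest alphabet size satisfying (7) for a tree of height k0."""
    return M**(k0 - 1) + 1

if __name__ == "__main__":
    random.seed(0)
    M, k0 = 3, 3
    leaves = [random.randint(0, 1) for _ in range(M**k0)]
    root, largest = sum_and_forward(leaves, M)
    print(root == sum(leaves), largest, required_alphabet_size(M, k0))
```

The root message equals the number of affirmative leaves, and the message at level k never exceeds $M^k$, which is why an alphabet of size $M^{k_0-1} + 1$ suffices for every agent below the root.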

Case II: We now consider the case where the tree height is very large; i.e., (7) does not hold. As shown 
in Fig. 3, we apply the scheme described in Case I; that is, the leaf agents send binary compressions of 
their measurements upward to their parent agents. Moreover, each intermediate agent simply sends the 
sum of the messages received to its parent agent; i.e., 

$$u_o^k = \sum_{t=1}^{M} u_t^{k-1}. \qquad (8)$$

From the assumption of large tree height, it is easy to see that the message alphabet size is not large 
enough for all the relay agents to use the fusion rule described in (8). With some abuse of notation, we 
let $k_0$ be the integer $k_0 = \lfloor \log_M(\mathcal{D} - 1) \rfloor + 1$ (here, $k_0$ is not the height of the tree; it is strictly less 
than the height). Note that $M^{k_0 - 1} + 1 \le \mathcal{D} < M^{k_0} + 1$. 

From the previous analysis, we can see that with this scheme, each agent at the $k_0$th level knows 
the number of its affirmative subtree leaves. Therefore, it is equivalent to consider the case where each 
agent at level $k_0$ connects to its $M^{k_0}$ subtree leaves directly (all the intermediate agents in the subtree 
can be ignored). However, we cannot use the fusion rule described in (8) for the agents at the $k_0$th level to 
generate the output messages because the message alphabet size is not large enough. Hence, we let each 
agent at level $k_0$ aggregate the binary messages from its subtree leaves into a new binary message 
(using some fusion rule). By doing so, the output message from each agent at the $k_0$th level is binary 
again. Henceforth, we can simply apply the fusion rule (8) and repeat this process throughout the tree, 
culminating at the root. We now provide an upper bound for the asymptotic decay rate in this case. 

Fig. 3. A message-passing scheme for non-binary message alphabets in an M-ary relay tree. 



Theorem 6: The convergence rate of the total error probability for an $(M, \mathcal{D})$-tree is equal to that for 
an $(M^{k_0}, 2)$-tree, where $k_0 = \lfloor \log_M(\mathcal{D} - 1) \rfloor + 1$. In particular, let $P_N$ be the total error probability at 
the root for an $(M, \mathcal{D})$-tree. With any combination of fusion rules at level $\ell k_0$, $\ell = 1, 2, \ldots$, we have 
$\log_2 P_N^{-1} = O(N^{\rho})$, where 

$$\rho := \frac{\ln(M^{k_0} + 1)}{\ln M^{k_0}} - \frac{\log_M 2}{k_0}.$$
Proof: Consider an $(M, \mathcal{D})$-tree with the scheme described above. It is easy to see that equivalently 
we can consider a tree where the leaf agents connect to the agents at the $k_0$th level directly. In addition, 
because of the recursive strategy applied throughout the tree, it suffices to consider the tree where the 
agents at the $\ell k_0$th level connect to the agents at the $(\ell + 1)k_0$th level directly for all non-negative integers 
$\ell$. Therefore, the convergence rate of an $(M, \mathcal{D})$-tree is equal to that of the corresponding $(M^{k_0}, 2)$-tree. 

In the asymptotic regime, the decay rate in (M, 2)-trees is bounded above as follows [57]: 

$$\log_2 P_N^{-1} = O\left(N^{\log_M \frac{M+1}{2}}\right).$$

Therefore, the decay rate for $(M^{k_0}, 2)$-trees is also bounded above as 

$$\log_2 P_N^{-1} = O\left(N^{\log_{M^{k_0}} \frac{M^{k_0}+1}{2}}\right),$$

which upon simplification gives the desired result. ■ 

Suppose that each agent at level $\ell k_0$ for all $\ell$ uses the majority dominance rule. Then, we can derive 
the convergence rate for the total error probability as follows. 

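Theorem 7: Consider $(M, \mathcal{D})$-trees where the majority dominance rule is used. Let 
$k_0 = \lfloor \log_M(\mathcal{D} - 1) \rfloor + 1$. We have $\log_2 P_N^{-1} = \Theta(N^{\varrho})$, where 

$$\varrho := \begin{cases} \dfrac{\ln(M^{k_0} + 1)}{\ln M^{k_0}} - \dfrac{\log_M 2}{k_0}, & \text{if } M \text{ is odd}, \\[2ex] 1 - \dfrac{\log_M 2}{k_0}, & \text{if } M \text{ is even}. \end{cases}$$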



Proof: By Theorem 6, the performance of $(M, \mathcal{D})$-trees is equal to that of $(M^{k_0}, 2)$-trees, where 
$k_0 = \lfloor \log_M(\mathcal{D} - 1) \rfloor + 1$. For the asymptotic rate, we have 

$$\log_2 P_N^{-1} = \Theta\left(N^{\log_{M^{k_0}} \lfloor (M^{k_0}+1)/2 \rfloor}\right),$$

which upon simplification gives the desired result. ■ 

Remark 7: Notice that $\lim_{M \to \infty} \ln(M^{k_0} + 1)/\ln M^{k_0} = 1$, which means that the even and odd cases 
in the expression for $\varrho$ are similar when M is large. 

Remark 8: From Theorem 7, we can see that with larger message alphabet size, the total error 
probability decays more quickly. However, the change in the decay exponent is not significant because 
$k_0$ depends on $\mathcal{D}$ logarithmically. Furthermore, if M is large, then the change in the performance is less 
sensitive to the increase in $\mathcal{D}$. 

Remark 9: Comparing the results in Theorems 6 and 7, we can see that the majority dominance rule 
achieves the optimal exponent in the oddary case and it almost achieves the optimal exponent in the 
evenary case. 

For the Bayesian likelihood-ratio test, we have the following result. 

Theorem 8: The convergence rate using the likelihood-ratio test is at least as good as that using the 
majority dominance rule; i.e., $\log_2 P_N^{-1} = \Omega(N^{\varrho})$. 

In the case where M is even, we can derive the decay rate using the alternative majority dominance 
strategy. 

Theorem 9: The convergence rate using the alternative majority dominance strategy is $\log_2 P_N^{-1} = 
\Omega(N^{\sigma})$, where 

$$\sigma := \frac{1}{2}\left(1 + \frac{\ln(M^{k_0} + 2)}{\ln M^{k_0}}\right) - \frac{\log_M 2}{k_0}.$$

Theorems 8 and 9 follow by applying the same arguments as those made in the proofs of Corollary 4 and 
Theorem 6, and the proofs are omitted for brevity. 
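To see how weakly the exponents depend on the alphabet size, the following sketch evaluates $k_0$ and the exponents of Theorems 6, 7, and 9 as written above. The function names are ours, and the expression used for $\sigma$ is the one implied by applying Corollary 6 to an $(M^{k_0}, 2)$-tree, so it applies to even M only.

```python
from math import log

def block_height(M, D):
    """k0 = floor(log_M(D - 1)) + 1, computed with integer arithmetic
    to avoid floating-point rounding near exact powers of M (D >= 2)."""
    k0 = 1
    while M**k0 <= D - 1:
        k0 += 1
    return k0

def alphabet_exponents(M, D):
    """Return (k0, rho, varrho, sigma) for an (M, D)-tree."""
    k0 = block_height(M, D)
    Mk = M**k0
    shift = log(2, M) / k0
    rho = log(Mk + 1) / log(Mk) - shift                 # Theorem 6 (upper bound)
    varrho = rho if M % 2 == 1 else 1 - shift           # Theorem 7 (majority)
    sigma = 0.5 * (1 + log(Mk + 2) / log(Mk)) - shift   # Theorem 9 (even M)
    return k0, rho, varrho, sigma

if __name__ == "__main__":
    for D in (2, 5, 20, 100):
        print(D, alphabet_exponents(4, D))
```

For M = 4, going from a binary alphabet ($\mathcal{D} = 2$, $k_0 = 1$) to $\mathcal{D} = 20$ ($k_0 = 3$) raises the majority-rule exponent from 0.5 to about 0.83, and further increases in $\mathcal{D}$ yield progressively smaller gains, consistent with Remark 8.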

The message-passing scheme provided here requires message alphabets with maximum size $\mathcal{D}$. However, 
most of the agents use much 'smaller' messages. For example, the leaf agents generate binary 
messages. It is interesting to characterize the average message size used in our scheme. Because of the 
recursive strategy, it suffices to calculate the average message size in a subtree with height $k_0 - 1$ since 
the message sizes in our scheme repeat every $k_0$ levels. The message size (in bits) for agents at level 
$t \in \{0, 1, \ldots, k_0 - 1\}$ is $\log_2(M^t + 1)$ and the number of agents at level t is $M^{k_0 - t}$. Therefore, the 
average size $b(k_0)$ in bits used in our scheme is 

$$b(k_0) = \frac{M^{k_0} + \cdots + M\log_2(M^{k_0-1} + 1)}{M^{k_0} + M^{k_0-1} + \cdots + M} = \frac{\sum_{t=0}^{k_0-1} M^{k_0-t}\log_2(M^t + 1)}{\sum_{t=0}^{k_0-1} M^{k_0-t}}.$$

We have 

$$\log_2(M^t + 1) > \log_2 M^t = t\log_2 M$$

and 

$$\log_2(M^t + 1) < \log_2(2M^t) = 1 + t\log_2 M$$

for all $t \ge 1$. Therefore, the average size in bits is lower bounded as 

$$b(k_0) \ge \frac{M^{k_0} + M^{k_0-1}\log_2 M + \cdots + M(k_0 - 1)\log_2 M}{M^{k_0} + M^{k_0-1} + \cdots + M}$$
$$= \frac{M^{k_0}}{M^{k_0} + M^{k_0-1} + \cdots + M} + \frac{\log_2 M\left(M^2(M^{k_0-1} - 1) - M(M-1)(k_0-1)\right)}{(M^{k_0} + M^{k_0-1} + \cdots + M)(M-1)^2}$$
$$= \frac{M^{k_0} - M^{k_0-1}}{M^{k_0} - 1} + \frac{M\log_2 M}{M - 1}\cdot\frac{M^{k_0-1} - 1 - M^{-1}(M-1)(k_0-1)}{M^{k_0} - 1}.$$

In addition, it is upper bounded as 

$$b(k_0) \le 1 + \frac{M\log_2 M}{M - 1}\cdot\frac{M^{k_0-1} - 1 - M^{-1}(M-1)(k_0-1)}{M^{k_0} - 1} \le 1 + \frac{\log_2 M}{M - 1}.$$

Recall that, with sufficiently large $k_0$, the error probability convergence rates are close to exponential. 
However, from the above bounds the average message size in terms of bits in our scheme is still very 
small; specifically, for sufficiently large $k_0$ we have 

$$1 + \frac{\log_2 M}{M - 1} - \frac{1}{M} \le b(k_0) \le 1 + \frac{\log_2 M}{M - 1}. \qquad (9)$$

Fig. 4 shows plots of the average message sizes $b(k_0)$ versus $k_0$ in the M = 10 and 20 cases. Note that 
as M increases, the average message size becomes smaller and the bounds in (9) become tighter. 
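The average message size and its large-$k_0$ bounds are straightforward to evaluate. The sketch below (function names ours) computes $b(k_0)$ directly from its definition as a weighted average of $\log_2(M^t + 1)$ over levels $t = 0, \ldots, k_0 - 1$, and compares it with the two sides of (9) as written above.

```python
from math import log2

def average_message_bits(M, k0):
    """b(k0): average message size in bits over levels t = 0, ..., k0 - 1,
    with M^(k0 - t) agents at level t each using log2(M^t + 1) bits."""
    num = sum(M**(k0 - t) * log2(M**t + 1) for t in range(k0))
    den = sum(M**(k0 - t) for t in range(k0))
    return num / den

def asymptotic_bounds(M):
    """Lower and upper limits in (9), intended for large k0."""
    upper = 1 + log2(M) / (M - 1)
    return upper - 1 / M, upper

if __name__ == "__main__":
    for M in (10, 20):
        lo, hi = asymptotic_bounds(M)
        print(M, lo, average_message_bits(M, 8), hi)
```

For both M = 10 and M = 20 the average stays well below 1.4 bits per message, and the gap between the two limits is exactly 1/M, so it shrinks as M grows.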

Fig. 4. (a) Average message size (dashed red line) in the M = 10 case. (b) Average message size (dashed red line) in the M = 20 
case. The blue lines represent the bounds in (9). 

VII. Concluding Remarks 

We have studied the social learning problem in the context of M-ary relay trees. We have analyzed the 
step-wise reductions of the Type I and Type II error probabilities and derived upper and lower bounds 
for each error probability at the root as explicit functions of N, which characterize the convergence rates 
for Type I, Type II, and the total error probabilities. We have shown that the majority dominance rule 
is not better than the Bayesian likelihood-ratio test in terms of convergence rate. We have studied the 
convergence rate using the alternative majority dominance strategy, which in turn shows that the majority 
dominance rule with random tie-breaking is suboptimal in the case where M is even. Last, we have 
provided a message-passing scheme which increases the convergence rate of the total error probability. 
We have shown quantitatively how the convergence rate varies with respect to the message alphabet sizes. 
This scheme is very efficient in terms of the average message size used for communication. 

Many interesting questions remain. Social networks usually involve very complex topologies. For 
example, the degree of branching may vary among different agents in the network. The convergence 
rate analysis for general complex structures is still wide open. Another question involves the assumption 
that the agent measurements are conditionally independent. It is of interest to study the scenario where 
these agent measurements are correlated. This scenario has been studied in the star configuration [62]-[64] 
but not yet in any other structures. Yet another question is related to the assumption that the 
communications and agents are perfectly reliable. We would like to study the rate of learning in cases 
where communications and agents are non-ideal [65]. 
Proof of Proposition 2 

Proof: We consider the ratio of $\alpha_{k+1}$ and $\alpha_k^{M/2}$. 
First, we show the lower bound of the ratio. We know that 
1 



and > (^Y^) for all k = 1, 2, . . . , M/2. Moreover, we have ( > (S/D = 1. In consequence, 
we have ak+i/a^^^'^ > 1. Notice that ak+i/a^^^'^ = h^^^^{ak)/2 + h^^^^_-^{ak)/2. By Lemma 1, the 
ratio is monotone increasing as $\alpha_k \to 0$. Hence, we have a^+i/a^^^^ < 1/' 